GenCore version 6.2.1
Copyright (c) 1993 - 2008 Biocceleration Ltd. 



OM protein - protein search, using sw model 

Run on: June 24, 2008, 15:38:05 ; Search time 510 Seconds 

(without alignments) 
2506.345 Million cell updates/sec 

Title : US-10-552-515-1_COPY_157_933
Perfect score: 4123 

Sequence: 1 QQDVQDGNTTVHYALLSASW SELSSHWTPFTVPKASQLQQ 777

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 5032670 seqs, 1645091341 residues 

Total number of hits satisfying chosen parameters: 19795 

Minimum DB seq length: 8 
Maximum DB seq length: 20 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : UniProt_12.1:*

1: uniprot_sprot : * 
2: uniprot_trembl : * 

Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 

                   %
Result         Query
   No.  Score  Match  Length  DB  ID            Description
     1     44    1.1      20   2  Q594K2_PLAFA  Q594k2 Plasmodium
     2     42    1.0      20   2  Q6BDK5_TRIMT  Q6bdk5 tricholoma
     3     41    1.0      18   2  Q4XC48_PLACH  Q4xc48 Plasmodium
ALIGNMENTS



RESULT 1 

Q594K2_PLAFA

ID Q594K2_PLAFA Unreviewed; 20 AA. 

AC Q594K2; 

DT 26-APR-2005, integrated into UniProtKB/TrEMBL . 
DT 26-APR-2005, sequence version 1. 
DT 24-JUL-2007, entry version 7. 

DE Digestive vacuole transmembrane protein (Fragment).
GN Name=CRT; 
OS Plasmodium falciparum.

OC Eukaryota; Alveolata; Apicomplexa; Aconoidasida; Haemosporida; 

OC Plasmodium; Plasmodium (Laverania).

OX NCBI_TaxID=5833; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=GUY-PHG13; 

RA Best Plummer W., Pinto Pereira L.M., Carrington C.V.F.; 

RT "Pfcrt and Pfmdr1 Alleles Associated with Chloroquine Resistance in

RT Plasmodium falciparum from Guyana, South America."; 

RL Submitted (MAR-2004) to the EMBL/GenBank/DDBJ databases.

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AY570260; AAU03451.1; -; Genomic_DNA. 

DR GO; GO:0016021; C:integral to membrane; IEA:UniProtKB-KW.

PE 4: Predicted; 

KW Transmembrane.

FT NON_TER 1 1 

FT NON_TER 20 20 

SQ SEQUENCE 20 AA; 2349 MW; 99A32D09DD195484 CRC64; 



Query Match 1.1%; Score 44; DB 2; Length 20; 

Best Local Similarity 50.0%; Pred. No. 3.8e+04; 

Matches 9; Conservative 6; Mismatches 3; Indels 



Qy 399 VFILILSKIYVSLAHVLT 416
       :|| ||| ||:|:: ::|
Db   1 IFIYILSIIYLSVSVMIT 18



RESULT 2 
Q6BDK5_TRIMT 

ID Q6BDK5_TRIMT Unreviewed; 20 AA. 

AC Q6BDK5; 

DT 13-SEP-2004, integrated into UniProtKB/TrEMBL . 

DT 13-SEP-2004, sequence version 1. 

DT 24-JUL-2007, entry version 8. 

DE Putative uncharacterized protein (Fragment) . 

OS Tricholoma matsutake (Matsutake mushroom) (Tricholoma nauseosum) . 

OC Eukaryota; Fungi; Dikarya; Basidiomycota; Agaricomycotina;

OC Agaricomycetes; Agaricomycetidae; Agaricales; Tricholomataceae;

OC Tricholoma. 

OX NCBI_TaxID=40145; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RA Murata H.;

RT "Characterization of the insertion sites of marY1, the gypsy-type

RT retrotransposon from the ectomycorrhizal basidiomycete Tricholoma

RT matsutake strain Y1, in the genome the fungus based on the inter-

RT retrotransposon amplified polymorphism analysis.";

RL Submitted (JAN-2004) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 
DR EMBL; AB160895; BAD32671.1; -; Genomic_DNA.

PE 4: Predicted;

FT NON_TER 1 1

FT NON_TER 20 20

SQ SEQUENCE 20 AA; 2213 MW; 84BDB0AB47F6443C CRC64;

Query Match 1.0%; Score 42; DB 2; Length 20; 

Best Local Similarity 52.9%; Pred. No. 5.6e+04; 

Matches 9; Conservative 1; Mismatches 7; Indels 

Qy 618 HLAVISNAFLLAFSSDF 634
       |:  ||   |  |||||
Db   1 HILGISKGILRVFSSDF 17



RESULT 3 
Q4XC48_PLACH 

ID Q4XC48_PLACH Unreviewed; 18 AA. 

AC Q4XC48; 

DT 05-JUL-2005, integrated into UniProtKB/TrEMBL . 

DT 05-JUL-2005, sequence version 1. 

DT 24-JUL-2007, entry version 7. 

DE Putative uncharacterized protein (Fragment) . 

GN ORFNames=PC403567.00.0;

OS Plasmodium chabaudi.

OC Eukaryota; Alveolata; Apicomplexa; Aconoidasida; Haemosporida; 

OC Plasmodium; Plasmodium (Vinckeia) . 

OX NCBI_TaxID=5825; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RX PubMed=15637271; DOI=10.1126/science.1103717;

RA Hall N., Karras M., Raine J.D., Carlton J.M., Kooij T.W.A.,

RA Berriman M., Florens L., Janssen C.S., Pain A., Christophides G.K.,

RA James K., Rutherford K., Harris B., Harris D., Churcher C.M.,

RA Quail M.A., Ormond D., Doggett J., Trueman H.E., Mendoza J.,

RA Bidwell S.L., Rajandream M.A., Carucci D.J., Yates J.R. III,

RA Kafatos F.C., Janse C.J., Barrell B.G., Turner C.M.R., Waters A.P.,

RA Sinden R.S.;

RT "A comprehensive survey of the Plasmodium life cycle by genomic, 

RT transcriptomic, and proteomic analyses."; 

RL Science 307:82-86(2005). 

CC -!- CAUTION: The sequence shown here is derived from an 

CC EMBL/GenBank/DDBJ whole genome shotgun (WGS) entry which is 

CC preliminary data. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; CAAJ01007713; CAH85524.1; -; Genomic_DNA.

PE 4: Predicted; 

FT NON_TER 1 1 

SQ SEQUENCE 18 AA; 2011 MW; 674A05D1A9721915 CRC64; 

Query Match 1.0%; Score 41; DB 2; Length 18; 

Best Local Similarity 50.0%; Pred. No. 5.9e+04; 
Matches 5; Conservative 4; Mismatches 1; Indels 0; Gaps



Qy 607 GIWFHILAGL 616
       |:|| :|:|:
Db   7 GVWFFVLSGI 16



RESULT 4 
Q76N52_HUMAN 

ID Q76N52_HUMAN Unreviewed; 17 AA. 

AC Q76N52; 

DT 05-JUL-2004, integrated into UniProtKB/TrEMBL . 

DT 05-JUL-2004, sequence version 1. 

DT 24-JUL-2007, entry version 12. 

DE Ribosomal protein L41 (Fragment) . 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires ; Primates; Haplorrhini; 

OC Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 



RN [1]
RP NUCLEOTIDE SEQUENCE.
RX MEDLINE=98248690; PubMed=9582194;
RA Kenmochi N., Kawaguchi T., Rozen S., Davis E., Goodman N.,
RA Hudson T.J., Tanaka T., Page D.C.;
RT "A map of 75 human ribosomal protein genes.";
RL Genome Res. 8:509-523(1998).
CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution-NoDerivs License
DR EMBL; AB007186; BAA28285.1; -; Genomic_DNA.
DR UniGene; Hs.112553
DR UniGene; Hs.242947
DR UniGene; Hs.282998
DR UniGene; Hs.356799
DR UniGene; Hs.434890
DR UniGene; Hs.532082
DR UniGene; Hs.632703
DR UniGene; Hs.649959
DR HGNC; HGNC:10354; RPL41.
PE 4: Predicted;
KW Ribosomal protein.
FT NON_TER 17 17
SQ SEQUENCE 17 AA; 2385 MW; 1990EBE3EEA7E344 CRC64;



Query Match 0.9%; Score 39; DB 2; Length 17; 

Best Local Similarity 43.8%; Pred. No. 8.1e+04; 

Matches 7; Conservative 4; Mismatches 5; Indels 



Qy 509 LKGWWQKFRLRSKKRK 524
       ::  |:| |:|  |||
Db   1 MRAKWRKKRMRRLKRK 16



RESULT 5 
Q5YKQ3_CONMU

ID Q5YKQ3_CONMU Unreviewed; 17 AA.

AC Q5YKQ3; 

DT 23-NOV-2004, integrated into UniProtKB/TrEMBL . 

DT 23-NOV-2004, sequence version 1. 

DT 24-JUL-2007, entry version 8. 

DE Calmodulin (Fragment) . 

OS Conus mus (Mouse cone) . 

OC Eukaryota; Metazoa; Mollusca; Gastropoda; Orthogastropoda; 

OC Apogastropoda; Caenogastropoda; Sorbeoconcha; Hypsogastropoda; 

OC Neogastropoda; Conoidea; Conidae; Conus. 

OX NCBI_TaxID=257335; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RA Duda T.F. Jr.;

RT "Divergence in tropical seas: Global biogeography and evolutionary 

RT history of the marine gastropod genus Conus."; 

RL Submitted (SEP-2003) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AY382052; AAS01366.1; -; Genomic_DNA. 

DR GO; GO:0005509; F:calcium ion binding; IEA:InterPro.

DR InterPro; IPR002048; EF_hand_Ca_bd.

DR PROSITE; PS50222; EF_HAND_2; 1.

PE 4: Predicted; 

FT NON_TER 1 1 

FT NON_TER 17 17

SQ SEQUENCE 17 AA; 1835 MW; B6BEFE6AD2DD90F6 CRC64; 



Query Match 0.9%; Score 39; DB 2; Length 17; 

Best Local Similarity 66.7%; Pred. No. 8.1e+04; 

Matches 6; Conservative 2; Mismatches 1; Indels 0; Gaps 



Qy   6 DGNTTVHYA 14
       ||| |:|:|
Db   5 DGNGTIHFA 13



RESULT 6 
CWP19_SOLLC

ID CWP19_SOLLC Reviewed; 18 AA.

AC P80815; 

DT 25-OCT-2005, integrated into UniProtKB/Swiss-Prot . 

DT 25-OCT-2005, sequence version 1. 

DT 24-JUL-2007, entry version 9. 

DE 76 kDa cell wall protein (Fragment) . 

OS Solanum lycopersicum (Tomato) (Lycopersicon esculentum) . 

OC Eukaryota; Viridiplantae ; Streptophyta; Embryophyta; Tracheophyta; 

OC Spermatophyta; Magnoliophyta; eudicotyledons ; core eudicotyledons ; 

OC asterids; lamiids; Solanales; Solanaceae; Solanoideae; Solaneae; 

OC Solanum; Lycopersicon. 

OX NCBI_TaxID=4081; 

RN [1] 

RP PROTEIN SEQUENCE, AND SUBCELLULAR LOCATION. 
RX MEDLINE=97332671; PubMed=9188482; DOI=10.1074/jbc.272.25.15841;
RA Robertson D., Mitchell G.P., Gilroy J.S., Gerrish C., Bolwell G.P.,
RA Slabas A.R.;
RT "Differential extraction and protein sequencing reveals major
RT differences in patterns of primary cell wall proteins from plants.";
RL J. Biol. Chem. 272:15841-15848(1997).
CC -!- SUBCELLULAR LOCATION: Secreted, cell wall.
CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC Distributed under the Creative Commons Attribution-NoDerivs License
PE 1: Evidence at protein level;
KW Cell wall; Direct protein sequencing; Secreted.
FT CHAIN 1 >18 76 kDa cell wall protein.
FT /FTId=PRO_0000079688.
FT NON_TER 18 18
SQ SEQUENCE 18 AA; 94 MW; 26676172F5F28409 CRC64;



Query Match 0.9%; Score 39; DB 1; Length 18; 

Best Local Similarity 63.6%; Pred. No. 8.7e+04; 

Matches 7; Conservative 1; Mismatches 3; Indels 



Qy  80 KLPRFLGSDNQ 90
       : | ||| |||
Db   3 RTPEFLGLDNQ 13



RESULT 7 

Q62256_MOUSE

ID Q62256_MOUSE Unreviewed; 18 AA. 

AC Q62256; 

DT 01-NOV-1996, integrated into UniProtKB/TrEMBL.

DT 01-NOV-1996, sequence version 1.

DT 24-JUL-2007, entry version 20. 

DE Spermatogenic-specific proenkephalin.

GN Name=Penk-rs;

OS Mus musculus (Mouse) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires ; Glires; Rodentia; Sciurognathi; 

OC Muroidea; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RX MEDLINE=90287163; PubMed=2355920;

RA Kilpatrick D.L., Zinn S.A., Fitzgerald M., Higuchi H., Sabol S.L., 

RA Meyerhardt J.; 

RT "Transcription of the rat and mouse proenkephalin genes is initiated 

RT at distinct sites in spermatogenic and somatic cells."; 

RL Mol. Cell. Biol. 10:3717-3726(1990). 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; M55181; AAA40127.1; -; mRNA. 

DR PIR; A35678; A35678. 

DR MGI; MGI:104628; Penk-rs.
PE 4: Predicted;

SQ SEQUENCE 18 AA; 2043 MW; B96E10CC7049FA76 CRC64; 

Query Match 0.9%; Score 39; DB 2; Length 18;

Best Local Similarity 54.5%; Pred. No. 8.7e+04;

Matches 6; Conservative 1; Mismatches 4; Indels 0; Gaps 

Qy 528 SAGASQGPWED 538
       |:|    ||||
Db   2 SSGKQDSPWED 12



RESULT 8 
A1Z5I9_HUMAN 

ID A1Z5I9_HUMAN Unreviewed; 19 AA. 

AC A1Z5I9; 

DT 06-FEB-2007, integrated into UniProtKB/TrEMBL . 

DT 06-FEB-2007, sequence version 1. 

DT 24-JUL-2007, entry version 2. 

DE Mediator of DNA damage checkpoint 1 variant 1 (Fragment) . 

GN Name=MDC1;

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires ; Primates; Haplorrhini; 

OC Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606 ; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RA Bu Y., Ozaki T., Suenaga Y., Nakanishi M., Kamijo T., Song F.,

RA Nakagawara A.; 

RT "Identification and characterization of human NFBD1 promoter.";

RL Submitted (DEC-2006) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; EF177823; ABM47421.1; -; mRNA. 

PE 4: Predicted; 

FT NON_TER 19 19 

SQ SEQUENCE 19 AA; 2326 MW; 0200C23525665A0E CRC64; 

Query Match 0.9%; Score 39; DB 2; Length 19; 

Best Local Similarity 43.8%; Pred. No. 9.3e+04; 

Matches 7; Conservative 3; Mismatches 6; Indels 0; Gaps 

Qy 291 TLAYRWDCSDYEDTEE 306
       | |  ||  : |:||:
Db   4 TQAIDWDVEEEEETEQ 19



RESULT 9 
A2CIY7_RABIT 

ID A2CIY7_RABIT Unreviewed; 19 AA. 

AC A2CIY7; 

DT 20-FEB-2007, integrated into UniProtKB/TrEMBL. 
DT 20-FEB-2007, sequence version 1. 
DT 24-JUL-2007, entry version 3.

DE O-mannosyl N-acetylglucosaminyltransferase (Fragment).

OS Oryctolagus cuniculus (Rabbit) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires ; Glires; Lagomorpha; Leporidae; 

OC Oryctolagus. 

OX NCBI_TaxID=9986; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RX PubMed=17345675; 

RA Farwick A., Jordan U., Fuellen G., Huchon D., Catzeflis F., 

RA Brosius J., Schmitz J.; 

RT "Automated scanning for phylogenetically informative transposed 

RT elements in rodents."; 

RL Syst. Biol. 55:936-948(2006). 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; DQ451084; ABE41717.1; -; Genomic_DNA. 

DR GO; GO:0016757; F:transferase activity, transferring glycosyl...; IEA:UniProtKB-KW.

PE 4: Predicted; 

KW Glycosyltransferase; Transferase.

FT NON_TER 1 1 

FT NON_TER 19 19 

SQ SEQUENCE 19 AA; 2202 MW; D7445A61F812B998 CRC64;



Query Match 0.9%; Score 39; DB 2; Length 19;

Best Local Similarity 50.0%; Pred. No. 9.3e+04; 

Matches 7; Conservative 2; Mismatches 3; Indels 



Qy 167 WGKWN--KYQPLDH 178
       || ||  : : |||
Db   3 WGTWNVDEAEVLDH 16



RESULT 10 
Q96RQ2_HUMAN 

ID Q96RQ2_HUMAN Unreviewed; 20 AA. 

AC Q96RQ2; 

DT 01-DEC-2001, integrated into UniProtKB/TrEMBL.

DT 01-DEC-2001, sequence version 1.

DT 24-JUL-2007, entry version 13. 

DE Natural killer cell receptor 2B4 (Fragment) . 

OS Homo sapiens (Human) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Euarchontoglires; Primates; Haplorrhini; 

OC Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606 ; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RX MEDLINE=21240684; PubMed=11342640; 

RA Chuang S.S., Pham H.T., Kumaresan P.R., Mathew P.A.;

RT "A prominent role for activator protein-1 in the transcription of the 

RT human 2B4 (CD244) gene in NK cells."; 

RL J. Immunol. 166:6188-6195(2001). 
CC

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AF297616; AAK57926.1; -; Genomic_DNA. 

DR UniGene; Hs.157872; -.

DR HGNC; HGNC:18171; CD244. 

DR GO; GO:0004872; F:receptor activity; IEA:UniProtKB-KW.

PE 4: Predicted; 

KW Receptor. 

FT NON_TER 20 20 

SQ SEQUENCE 20 AA; 2243 MW; EBF997A9C0CF71EB CRC64; 



Query Match 0.9%; Score 38.5; DB 2;

Best Local Similarity 44.4%; Pred. No. 1.1e+05;
Matches 8; Conservative 5; Mismatches 4;



Qy 391 LTGSVVNLVFILILSKIY 408
       : | || |: :|:| |:|
Db   1 MLGQVVTLILLLLL-KVY 17



RESULT 11 
Q4XES0_PLACH 

ID Q4XES0_PLACH Unreviewed; 20 AA. 

AC Q4XES0; 

DT 05-JUL-2005, integrated into UniProtKB/TrEMBL . 

DT 05-JUL-2005, sequence version 1. 

DT 24-JUL-2007, entry version 7. 

DE Putative uncharacterized protein (Fragment) . 

GN ORFNames=PC402444.00.0; 

OS Plasmodium chabaudi . 

OC Eukaryota; Alveolata; Apicomplexa; Aconoidasida; Haemosporida; 

OC Plasmodium; Plasmodium (Vinckeia) . 

OX NCBI_TaxID=5825; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RX PubMed=15637271; DOI=10.1126/science.1103717;

RA Hall N., Karras M., Raine J.D., Carlton J.M., Kooij T.W.A.,

RA Berriman M., Florens L., Janssen C.S., Pain A., Christophides G.K.,

RA James K., Rutherford K., Harris B., Harris D., Churcher C.M.,

RA Quail M.A., Ormond D., Doggett J., Trueman H.E., Mendoza J.,

RA Bidwell S.L., Rajandream M.A., Carucci D.J., Yates J.R. III,

RA Kafatos F.C., Janse C.J., Barrell B.G., Turner C.M.R., Waters A.P.,

RA Sinden R.S.;

RT "A comprehensive survey of the Plasmodium life cycle by genomic, 

RT transcriptomic, and proteomic analyses."; 

RL Science 307:82-86(2005). 

CC -!- CAUTION: The sequence shown here is derived from an 

CC EMBL/GenBank/DDBJ whole genome shotgun (WGS) entry which is 

CC preliminary data. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; CAAJ01006918; CAH84598.1; -; Genomic_DNA.
PE 4: Predicted;

FT NON_TER 1 1 

SQ SEQUENCE 20 AA; 2329 MW; FA3AE53FC706E353 CRC64; 

Query Match 0.9%; Score 38.5; DB 2; Length 20; 

Best Local Similarity 50.0%; Pred. No. 1.1e+05;

Matches 10; Conservative 3; Mismatches 6; Indels 1; Gaps 1; 

Qy 432 TLKVFIFQFVNFYSSPVYIA 451
       | | |||:| | :  |: ||
Db   1 TQKGFIFKF-NMFGLPLKIA 19



RESULT 12 
LPW_ECO57

ID LPW_ECO57 Reviewed; 14 AA.

AC P0AD94; P03053; 

DT 21-JUL-1986, integrated into UniProtKB/Swiss-Prot . 

DT 21-JUL-1986, sequence version 1. 

DT 24-JUL-2007, entry version 11. 

DE Trp operon leader peptide . 

GN Name=trpL; OrderedLocusNames=Z2545, ECs1837;

OS Escherichia coli O157:H7.

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;

OC Enterobacteriaceae; Escherichia.

OX NCBI_TaxID=83334; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA] . 

RC STRAIN=O157:H7 / EDL933 / ATCC 700927 / EHEC;

RX MEDLINE=21074935; PubMed=11206551; DOI=10.1038/35054089;

RA Perna N.T., Plunkett G. III, Burland V., Mau B., Glasner J.D.,

RA Rose D.J., Mayhew G.F., Evans P.S., Gregor J., Kirkpatrick H.A., 

RA Posfai G., Hackett J., Klink S., Boutin A., Shao Y., Miller L., 

RA Grotbeck E.J., Davis N.W., Lim A., Dimalanta E.T., Potamousis K., 

RA Apodaca J., Anantharaman T.S., Lin J., Yen G., Schwartz D.C., 

RA Welch R.A., Blattner F.R.; 

RT "Genome sequence of enterohaemorrhagic Escherichia coli O157:H7.";

RL Nature 409:529-533(2001). 

RN [2] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA] . 

RC STRAIN=O157:H7 / Sakai / RIMD 0509952 / EHEC;

RX MEDLINE=21156231; PubMed=11258796; DOI=10.1093/dnares/8.1.11;

RA Hayashi T., Makino K., Ohnishi M., Kurokawa K., Ishii K., Yokoyama K., 

RA Han C.-G., Ohtsubo E., Nakayama K., Murata T., Tanaka M., Tobe T., 

RA Iida T., Takami H., Honda T., Sasakawa C., Ogasawara N., Yasunaga T.,

RA Kuhara S., Shiba T., Hattori M., Shinagawa H.; 

RT "Complete genome sequence of enterohemorrhagic Escherichia coli 

RT O157:H7 and genomic comparison with a laboratory strain K-12.";

RL DNA Res. 8:11-22(2001). 

CC -!- FUNCTION: This protein is involved in control of the biosynthesis 
CC of tryptophan (By similarity) . 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AE005174; AAG56550.1; -; Genomic_DNA. 
DR EMBL; BA000007; BAB35260.1; -; Genomic_DNA.

DR PIR; B85761; B85761. 

DR PIR; E90858; E90858. 

DR GenomeReviews; BA000007_GR; ECs1837.

DR GenomeReviews; AE005174_GR; Z2545. 

DR KEGG; ece:Z2545; -. 

DR KEGG; ecs:ECs1837; -.

DR InterPro; IPR013205; Leader_Trp_op . 

DR Pfam; PF08255; Leader_Trp; 1. 

PE 3: Inferred from homology; 

KW Amino-acid biosynthesis; Aromatic amino acid biosynthesis; 

KW Complete proteome; Leader peptide; Tryptophan biosynthesis. 

FT PEPTIDE 1 14 Trp operon leader peptide. 

FT /FTId=PRO_0000044024 . 

SQ SEQUENCE 14 AA; 1723 MW; 5B79306E3E804A37 CRC64; 



Query Match 0.9%; Score 38; DB 1; Length 14;

Best Local Similarity 83.3%; Pred. No. 7.6e+04;

Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps

Qy 509 LKGWWQ 514
       |||||:
Db   7 LKGWWR 12



RESULT 13 

LPW_ECOL6

ID LPW_ECOL6 Reviewed; 14 AA.

AC P0AD93; P03053; 

DT 21-JUL-1986, integrated into UniProtKB/Swiss-Prot . 

DT 21-JUL-1986, sequence version 1. 

DT 21-AUG-2007, entry version 13. 

DE Trp operon leader peptide . 

GN Name=trpL; OrderedLocusNames=c5494; 

OS Escherichia coli O6.

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales ; 

OC Enterobacteriaceae; Escherichia. 

OX NCBI_TaxID=217992; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA] . 

RC STRAIN=O6:H1 / CFT073 / ATCC 700928 / UPEC;

RX MEDLINE=22388234; PubMed=12471157; DOI=10.1073/pnas.252529799;

RA Welch R.A., Burland V., Plunkett G. III, Redford P., Roesch P.,

RA Rasko D., Buckles E.L., Liou S.-R., Boutin A., Hackett J., Stroud D., 

RA Mayhew G.F., Rose D.J., Zhou S., Schwartz D.C., Perna N.T.,

RA Mobley H.L.T., Donnenberg M.S., Blattner F.R.; 

RT "Extensive mosaic structure revealed by the complete genome sequence 

RT of uropathogenic Escherichia coli."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:17020-17024(2002). 

CC -!- FUNCTION: This protein is involved in control of the biosynthesis 
CC of tryptophan (By similarity) . 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AE014075; AAN80196.1; -; Genomic_DNA. 
DR GenomeReviews; AE014075_GR; c_5494.

DR KEGG; ecc:c5494; -. 

DR InterPro; IPR013205; Leader_Trp_op . 

DR Pfam; PF08255; Leader_Trp; 1. 

PE 3: Inferred from homology; 

KW Amino-acid biosynthesis; Aromatic amino acid biosynthesis; 

KW Complete proteome; Leader peptide; Tryptophan biosynthesis. 

FT PEPTIDE 1 14 Trp operon leader peptide. 

FT /FTId=PRO_0000044025 . 

SQ SEQUENCE 14 AA; 1723 MW; 5B79306E3E804A37 CRC64; 



Query Match 0.9%; Score 38; DB 1; Length 14;

Best Local Similarity 83.3%; Pred. No. 7.6e+04;

Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps

Qy 509 LKGWWQ 514
       |||||:
Db   7 LKGWWR 12



RESULT 14 
LPW_ECOLI 

ID LPW_ECOLI Reviewed; 14 AA. 

AC P0AD92; P03053; Q2MBG1; 

DT 21-JUL-1986, integrated into UniProtKB/Swiss-Prot . 

DT 21-JUL-1986, sequence version 1. 

DT 24-JUL-2007, entry version 14. 

DE Trp operon leader peptide. 

GN Name=trpL; Synonyms=trpEE; OrderedLocusNames=b1265, JW1257;

OS Escherichia coli. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales ; 

OC Enterobacteriaceae; Escherichia. 

OX NCBI_TaxID=562; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [GENOMIC DNA] . 

RX MEDLINE=82150258; PubMed=7038627; DOI=10.1093/nar/9.24.6647;

RA Yanofsky C., Platt T., Crawford I.P., Nichols B.P., Christie G.E.,

RA Horowitz H., van Cleemput M., Wu A.M.; 

RT "The complete nucleotide sequence of the tryptophan operon of 

RT Escherichia coli."; 

RL Nucleic Acids Res. 9:6647-6668(1981). 

RN [2] 

RP NUCLEOTIDE SEQUENCE [GENOMIC DNA] . 

RX MEDLINE=76240562; PubMed=781271; DOI=10.1016/0022-2836(76)90317-X;

RA Squires C., Lee F., Bertrand K., Squires C.L., Bronson M.J.,

RA Yanofsky C.;

RT "Nucleotide sequence of the 5' end of tryptophan messenger RNA of 

RT Escherichia coli."; 

RL J. Mol. Biol. 103:351-381(1976). 

RN [3] 

RP NUCLEOTIDE SEQUENCE [GENOMIC DNA] . 

RX MEDLINE=80101455; PubMed=118451;

RA Oxender D.L., Zurawski G., Yanofsky C.;

RT "Attenuation in the Escherichia coli tryptophan operon: role of RNA 

RT secondary structure involving the tryptophan codon region."; 

RL Proc. Natl. Acad. Sci. U.S.A. 76:5524-5528(1979). 
RN [4]

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA] . 

RC STRAIN=K12 / MG1655 / ATCC 47076; 

RX MEDLINE=97426617; PubMed=9278503; DOI=10.1126/science.277.5331.1453;

RA Blattner F.R., Plunkett G. III, Bloch C.A., Perna N.T., Burland V.,

RA Riley M., Collado-Vides J., Glasner J.D., Rode C.K., Mayhew G.F., 

RA Gregor J., Davis N.W., Kirkpatrick H.A., Goeden M.A., Rose D.J., 

RA Mau B., Shao Y.;

RT "The complete genome sequence of Escherichia coli K-12."; 

RL Science 277:1453-1474(1997) . 

RN [5] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA] . 

RC STRAIN=K12 / W3110 / ATCC 27325 / DSM 5911; 

RX PubMed=16738553; DOI=10.1038/msb4100049;

RA Hayashi K., Morooka N., Yamamoto Y., Fujita K., Isono K., Choi S., 

RA Ohtsubo E., Baba T., Wanner B.L., Mori H., Horiuchi T.; 

RT "Highly accurate genome sequences of Escherichia coli K-12 strains 

RT MG1655 and W3110."; 

RL Mol. Syst. Biol. 2:E1-E5(2006).

RN [6] 

RP STRUCTURE BY NMR. 

RX MEDLINE=94089403; PubMed=7505428; DOI=10.1093/nar/21.23.5485;

RA Ramesh V.;

RT "NMR evidence for the RNA stem-loop structure involved in the 

RT transcription attenuation of E. coli trp operon."; 

RL Nucleic Acids Res. 21:5485-5488(1993). 

CC -!- FUNCTION: This protein is involved in control of the biosynthesis 
CC of tryptophan. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; J01714; AAA57296.1; -; Genomic_DNA. 

DR EMBL; U00096; AAC74347.1; -; Genomic_DNA. 

DR EMBL; AP009048; BAE76395.1; -; Genomic_DNA. 

DR PIR; A03589; LFECW. 

DR GenomeReviews; U00096_GR; b1265.

DR GenomeReviews; AP009048_GR; JW1257. 

DR KEGG; ecj:JW1257; -. 

DR KEGG; eco:b1265; -.

DR EchoBASE; EB1252; -. 

DR EcoGene; EG11274; trpL.

DR BioCyc; EcoCyc:EG11274-MONOMER; -.

DR InterPro; IPR013205; Leader_Trp_op . 

DR Pfam; PF08255; Leader_Trp; 1. 

PE 1: Evidence at protein level; 

KW Amino-acid biosynthesis; Aromatic amino acid biosynthesis; 

KW Complete proteome; Leader peptide; Tryptophan biosynthesis. 

FT PEPTIDE 1 14 Trp operon leader peptide. 

FT /FTId=PRO_0000044023 . 

SQ SEQUENCE 14 AA; 1723 MW; 5B79306E3E804A37 CRC64; 



Query Match 0.9%; Score 38; DB 1; Length 14; 

Best Local Similarity 83.3%; Pred. No. 7.6e+04; 

Matches 5; Conservative 1; Mismatches 0; Indels 0; Gaps 
Qy 509 LKGWWQ 514
       |||||:
Db   7 LKGWWR 12



RESULT 15 
LPW_SHIFL 

ID LPW_SHIFL Reviewed; 14 AA. 

AC P0AD95; P03053; 

DT 21-JUL-1986, integrated into UniProtKB/Swiss-Prot . 

DT 21-JUL-1986, sequence version 1. 

DT 21-AUG-2007, entry version 15. 

DE Trp operon leader peptide. 

GN Name=trpL; OrderedLocusNames=SF1268, S4805; 

OS Shigella flexneri. 

OC Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales ; 

OC Enterobacteriaceae; Shigella. 

OX NCBI_TaxID=623; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA] . 

RC STRAIN=301 / Serotype 2a; 

RX MEDLINE=22272406; PubMed=12384590; DOI=10.1093/nar/gkf566;

RA Jin Q., Yuan Z., Xu J., Wang Y., Shen Y., Lu W., Wang J., Liu H., 

RA Yang J., Yang F., Zhang X., Zhang J., Yang G., Wu H., Qu D., Dong J., 

RA Sun L., Xue Y., Zhao A., Gao Y., Zhu J., Kan B., Ding K., Chen S., 

RA Cheng H., Yao Z., He B., Chen R., Ma D., Qiang B., Wen Y., Hou Y., 

RA Yu J.;

RT "Genome sequence of Shigella flexneri 2a: insights into pathogenicity

RT through comparison with genomes of Escherichia coli K12 and O157.";

RL Nucleic Acids Res. 30:4432-4441(2002). 

RN [2]

RP NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA] . 

RC STRAIN=ATCC 700930 / 2457T / Serotype 2a; 

RX MEDLINE=22590274; PubMed=12704152; 

RX DOI=10.1128/IAI.71.5.2775-2786.2003;

RA Wei J., Goldberg M.B., Burland V., Venkatesan M.M., Deng W.,

RA Fournier G., Mayhew G.F., Plunkett G. III, Rose D.J., Darling A.,

RA Mau B., Perna N.T., Payne S.M., Runyen-Janecky L.J., Zhou S.,

RA Schwartz D.C., Blattner F.R.;

RT "Complete genome sequence and comparative genomics of Shigella

RT flexneri serotype 2a strain 2457T.";

RL Infect. Immun. 71:2775-2786(2003). 

CC -!- FUNCTION: This protein is involved in control of the biosynthesis 
CC of tryptophan (By similarity) . 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AE005674; AAN42881.1; -; Genomic_DNA.

DR EMBL; AE014073; AAP16766.1; -; Genomic_DNA. 

DR GenomeReviews; AE014073_GR; S_4805. 

DR GenomeReviews; AE005674_GR; SF1268. 

DR KEGG; sfl:SF1268; -. 

DR KEGG; sfx:S4805; -. 

DR InterPro; IPR013205; Leader_Trp_op . 

DR Pfam; PF08255; Leader_Trp; 1. 
PE 3: Inferred from homology;
KW Amino-acid biosynthesis; Aromatic amino acid biosynthesis;
KW Complete proteome; Leader peptide; Tryptophan biosynthesis.
FT PEPTIDE 1 14 Trp operon leader peptide.
FT /FTId=PRO_0000044029.
SQ SEQUENCE 14 AA; 1723 MW; 5B79306E3E804A37 CRC64;

Query Match 0.9%; Score 38; DB 1; Length 14;
Best Local Similarity 83.3%; Pred. No. 7.6e+04;
Matches 5; Conservative 1; Mismatches 0;

Qy 509 LKGWWQ 514
       |||||:
Db   7 LKGWWR 12



Search completed: June 24, 2008, 15:46:46 
Job time : 513 secs